Abstract

Managing a portfolio to a risk model can tilt the portfolio toward weaknesses of the model. 
As a result, the optimized portfolio acquires downside exposure to uncertainty in the model itself, 
what we call "second order risk." We propose a risk measure that accounts for this bias. Studies 
of real portfolios, in asset-by-asset and factor model contexts, demonstrate that second order 
risk contributes significantly to realized volatility, and that the proposed measure accurately 
forecasts the out-of-sample behavior of optimized portfolios. 
1 Introduction



Classical finance assumes the markets to be like a game of chance: Although future events are 
uncertain, the distribution of these events is known. We cannot predict how the dice will land, but 
we can calculate the odds of any given outcome with certainty. We can expect to roll snake-eyes 
on average one time in 36, and the rules of the game do not change without warning. 

Unfortunately, real financial markets do not behave like a game of chance: Market volatility 
is itself volatile; hot industries come and go; new companies are listed and others merge or go 
bankrupt. Under even the most generous assumptions, our estimates of financial risk are uncertain, 
based on limited historical observation, extrapolated forward. 

For a passively invested portfolio, the effect of such uncertainty is as likely to be good as bad.
The total risk may be overforecast or underforecast, but taken on average these errors tend to wash
out. In contrast, an optimized portfolio is more likely to be hurt by uncertainty than helped
by it. Constructing portfolios to minimize risk can make them safer, but at the cost of introducing 
an asymmetric exposure to "second order risk." 

In this paper, we explore a framework to quantify and forecast second order risk. Exploring 
only its mildest sources, we demonstrate that the act of optimizing a portfolio to a risk measure can 
render that measure systematically inaccurate. However, rather than abandon risk measurement 
or ignore its uncertainties, the framework shows that we may begin to account for second order 
risk as we do more familiar sources of uncertainty. 

To quote a former US Secretary of Defense: 

"There are known knowns. These are things we know that we know. There are known unknowns. 
That is to say, there are things that we know we don't know. But there are also unknown unknowns. 
There are things we don't know we don't know."

-Donald Rumsfeld, February 12, 2002 [1] 

Our aim is to bring some of the latter into the category of "known unknowns." Correcting for 
these uncertainties in general leads to a more conservative view of risk. However, there will always 
be weaknesses in our models and much we cannot anticipate [2]. Perhaps a fourth category of 
"unknown knowns" is the most dangerous: things we think we know, but don't. 
1.1 A Toy Model

To see the cause of second order risk, and how it can be forecast, consider the following toy example: 
Between two assets with the same expected return, an active manager aims to minimize risk by 
investing in the asset with the smaller standard deviation. In this example, investors are constrained 
to hold a single asset. After observing the returns of the assets, the manager finds Asset 1 to have 
a standard deviation of 8%, while Asset 2 has a standard deviation of 11%. Placing a bet on Asset 
1, the active manager believes the portfolio to have a risk of 8%. 

Although the manager doesn't know it, the returns of both assets are drawn from the same 
distribution, with standard deviation of 10%. The true risk is 10%, regardless of which asset was 
chosen, but the active manager's strategy is more likely to make investments whose risk happens 
to be underforecast. Meanwhile, passive investors are just as likely to hold either asset. Looking 
at the same data, a passive investor holding Asset 1 would underforecast risk, while an investor 
holding Asset 2 would overforecast risk, but with no bias toward either outcome. 

Figure 1 shows the result of repeating this experiment many times. In each trial, two time series
are drawn from the same distribution and risk estimates are made. The active manager bets on 
the asset with lower risk forecast, while the passive investor always holds Asset 1. Noise diminishes 
the accuracy of both investors' risk forecasts, but it systematically biases only the active manager, 
whose average forecast is 8.7%, less than the true 10%. The wise active manager would correct 
risk forecasts upward, to compensate for the bias introduced by active management. Although the 
active manager does not know the true distribution of returns, we will see that it is possible to 
compensate for this bias. 

An intriguing implication is that the best risk forecast depends not just on the portfolio holdings, 
but also on the strategy. In the simulation, the two managers hold identical portfolios in half of the 
trials, and forecast risk based on identical returns. Nonetheless, because of differences of strategy, 
they have reason to make different risk forecasts, even when their portfolios exactly coincide. 
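
The toy experiment can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code behind Figure 1: the function name `simulate_toy_model`, the Gaussian returns, the ten observation periods, and the choice to average variance forecasts before taking a square root are all assumptions, so the active manager's average forecast will be close to, but not necessarily equal to, the 8.7% quoted above.

```python
import numpy as np

def simulate_toy_model(n_trials=100_000, n_periods=10, true_sigma=0.10, seed=0):
    """Return (passive, active) root-mean-square risk forecasts (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Two assets per trial, both drawn from the same Gaussian distribution.
    returns = rng.normal(0.0, true_sigma, size=(n_trials, 2, n_periods))
    sample_var = returns.var(axis=2, ddof=1)   # unbiased variance estimate per asset
    passive_var = sample_var[:, 0]             # passive investor always holds Asset 1
    active_var = sample_var.min(axis=1)        # active manager picks the lower forecast
    return np.sqrt(passive_var.mean()), np.sqrt(active_var.mean())

passive, active = simulate_toy_model()
print(f"passive forecast: {passive:.1%}, active forecast: {active:.1%}")
```

The passive forecast averages to the true 10% because it is an unbiased estimate of a fixed asset, while taking the minimum of two noisy estimates pulls the active forecast below the truth.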

2 Model Uncertainty 

The example above is a case of aiming to maximize a utility function $U(w)$ for which we have only
an approximate model $\hat{U}(w)$. With perfect information, we would choose the variables $w = w^*$ to
maximize $U(w)$, but instead we must choose some other $\hat{w}$, the best guess given what is known.
The difference between $U(w)$ and $\hat{U}(w)$ leads to some discrepancy between the true best $w^*$ and
the best guess $\hat{w}$ given the available information.
[Figure 1 plot. Legend: Active Investor (8.7%), Passive Investor (10.0%), True Risk. Horizontal axis: Risk Forecast (%).]

Figure 1: The distribution of risk forecasts of a toy model of active and passive investors. After 
observing the returns of two assets for ten periods, the active manager selects the asset with lower 
sample standard deviation, while a passive investor is equally likely to hold either asset. Although 
the true risk is 10% in all cases, the active manager consistently underforecasts risk. 
As depicted in Figure 2, the effect of such a discrepancy is generically a loss: any departure
$\Delta w$ from $w^*$ reduces utility. For small errors $\Delta w$, the utility of $\hat{w}$ can be approximated

$$U[\hat{w}] = U[w^*] + \Delta U \approx U[w^*] + \Delta w' H \Delta w, \tag{1}$$

where $H$ is the Hessian of $U(w)$ at $w^*$, the matrix of second derivatives. Simply because any
function is concave at its maximum, $H$ is a negative-definite matrix, and the correction $\Delta w' H \Delta w$
is negative for any $\Delta w \neq 0$.
Figure 2: The schematic effect of errors in the model utility function. A model utility function
usually has its maximum at least slightly removed from the true maximum, by $\Delta w$. Though the
realized $\hat{w}$ appears optimal to the model, it incurs a penalty $\Delta U$ under the true utility function.

Using the model $\hat{U}(w)$ to forecast the utility of $\hat{w}$ misses the penalty $\Delta w' H \Delta w$ that is the
inevitable side-effect of having only an approximation. If we can calculate a distribution of $\Delta w$,
we can account for the average loss $\sim E(\Delta w' H \Delta w)$ that arises due to uncertainty in $w^*$.

For the utility functions of finance, these errors are compounded by a tendency for models $\hat{U}(w)$
to appear intrinsically better than the true $U(w)$.^1 As a result, the point labeled Naive Best in
Figure 2 is above even the True Best, attainable with perfect information utility $U$, and well above
the True utility of $\hat{w}$.
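
As a small numerical illustration of the average loss, suppose the error $\Delta w$ has mean zero and covariance $S$; then $E(\Delta w' H \Delta w) = \mathrm{tr}(HS)$, which is negative whenever $H$ is negative definite. The matrices `H` and `S` below are arbitrary stand-ins chosen for the sketch, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical negative-definite Hessian at the true maximum.
A = rng.normal(size=(n, n))
H = -(A @ A.T + np.eye(n))

# Hypothetical covariance of the error dw around the true optimum.
B = rng.normal(size=(n, n))
S = 0.01 * (B @ B.T)

# Draw many errors and evaluate the quadratic penalty dw' H dw for each draw.
dw = rng.multivariate_normal(np.zeros(n), S, size=200_000)
penalties = np.einsum("ti,ij,tj->t", dw, H, dw)

expected_penalty = np.trace(H @ S)   # closed form for zero-mean errors
print(penalties.mean(), expected_penalty)
```

Every individual penalty is negative, so the loss cannot average out across realizations of the error.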



2.1 Uncertainty and Active Management 

Classical portfolio theory [3] instructs the portfolio manager to build the optimal portfolio $w$ from
the covariance matrix $\Omega$ and the vector $\alpha$ of expected excess returns. A variety of utility functions may
be used, among them the Sharpe ratio

$$U(w) = \frac{w'\alpha}{\sqrt{w'\Omega w}}. \tag{2}$$

In the absence of constraints, the portfolio maximizing (2) has weights proportional to

$$w^* = \Omega^{-1}\alpha. \tag{3}$$

However, even assuming the markets to be stationary and Gaussian, the covariance matrix $\Omega$ must
be estimated from observation of historical behavior, which introduces noise.^2
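
The unconstrained solution can be checked numerically. The sketch below builds an arbitrary positive-definite covariance matrix and alpha vector (both invented for illustration), computes weights proportional to $\Omega^{-1}\alpha$, and compares the resulting Sharpe ratio with that of many random portfolios.

```python
import numpy as np

def sharpe(w, alpha, omega):
    """Sharpe ratio utility of Equation (2) for weights w."""
    return (w @ alpha) / np.sqrt(w @ omega @ w)

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
omega = A @ A.T / n + 0.1 * np.eye(n)   # hypothetical true covariance matrix
alpha = rng.normal(size=n)              # hypothetical expected excess returns

w_star = np.linalg.solve(omega, alpha)  # weights proportional to omega^{-1} alpha
best = sharpe(w_star, alpha, omega)

# No randomly drawn portfolio should beat the optimized one.
random_best = max(sharpe(rng.normal(size=n), alpha, omega) for _ in range(10_000))
print(best, random_best)
```

With the true covariance matrix in hand the optimum is exact; the remainder of the section concerns what happens when only an estimate is available.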

Even if this noise level can be made relatively small, so that each element of $\hat{\Omega}$ is known with
relative certainty, optimization tends to align the portfolio with the noise [4], compounding many
small errors into a large effect. As the number of observations $T$ increases, the noise tends to be
reduced by $\sim 1/T$, and a good estimator can ensure that these errors average to zero.

The impact of this small amount of noise is nonetheless significant. For $\hat{\Omega}$ estimated directly
from $N$ assets, we will see that the effect of noise on the optimized portfolio does not average to
zero, but yields corrections of order

$$\frac{1}{1 - N/T},$$

growing without bound as the number of assets approaches the number of observations. Since $T$ is
limited by changing dynamics and market microstructure, this can lead to significantly inaccurate
risk forecasts, and diminished out-of-sample performance.

^1 Physicists may recognize a relation to the tendency of quantum mechanical perturbations to systematically lower
the ground state energy.

^2 Another source of noise, uncertainty in $\alpha$, may also be significant. This subjective uncertainty could be
incorporated into this framework, but we concentrate upon uncertainty in $\Omega$, taking $\alpha$ to be known to the investor.

For a factor model of risk [5], $N/T$ is replaced by the milder $K/T$, where $K$ is the number
of systematic risk factors. This makes portfolio optimization among many assets more robust to
estimation errors, but may leave significant corrections to risk forecasts.

2.2 A Second Order Risk Measure 

The denominator of the Sharpe ratio (2) is the standard deviation $\Sigma$ of future portfolio returns, a
common measure of portfolio risk:

$$\Sigma^2 = E_r\left((w'r - w'\bar{r})^2 \mid \Omega\right) = w'\Omega w. \tag{4}$$

Here $E_x(f(x) \mid y)$ denotes an average over the variable $x$, conditional on $y$.^3 If the true covariance
matrix $\Omega$ were known, $\Sigma$ would be a good measure of uncertainty, but in practice we must make
do with an estimate $\hat{\Omega}$ of the true distribution, based on observation.

Relative to the hypothetical true covariance matrix, the estimate is a random variable. With
$\hat{\Omega}$ used in place of $\Omega$ in Equation (3), the optimized portfolio

$$\hat{w}(\hat{\Omega}) = \hat{\Omega}^{-1}\alpha \tag{5}$$

is also a random variable, distributed about the true optimal portfolio $w^*$.

The risk of $\hat{w}(\hat{\Omega})$ therefore arises from two contributions. In addition to the usual uncertainty
of future returns $r$, there is a second risk associated with the randomness of the observation $\hat{\Omega}$
about $\Omega$, which is typically neglected.

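^3 If there is no ambiguity, this may be denoted $E(f(x))$ or $E(f)$, to avoid cluttering the notation.

To account for the latter uncertainty, we define a risk measure by extending the expectation
value of Equation (4) to average over both ensembles:

$$\Sigma^2_{\mathrm{SO}} = E_{\hat{\Omega}, r}\left((\hat{w}'r - \hat{w}'\bar{r})^2 \mid \Omega\right). \tag{6}$$

Performing the average over the returns $r$ given $\hat{\Omega}$ we have

$$\Sigma^2_{\mathrm{SO}} = E_{\hat{\Omega}}\left[E_r\left((\hat{w}'r - \hat{w}'\bar{r})^2 \mid \hat{\Omega}, \Omega\right) \mid \Omega\right],$$

or

$$\Sigma^2_{\mathrm{SO}} = E_{\hat{\Omega}}\left(\hat{w}'\Omega\hat{w} \mid \Omega\right). \tag{7}$$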

The final expression accounts for both the risk present in a given distribution and the additional
risk due to distributional uncertainty. Although similar in appearance to Equation (4), it differs significantly
in that it depends not on the portfolio holdings, but on the strategy that led to them, through
$\hat{w}(\hat{\Omega})$. It is our aim to reliably estimate it.

What is typically used to forecast risk, the "naive estimator"

$$\hat{\Sigma}^2_{\mathrm{naive}} = \hat{w}'\hat{\Omega}\hat{w}, \tag{8}$$

may be significantly biased, even if the covariance matrix estimator $\hat{\Omega}$ is unbiased, $E_{\hat{\Omega}}(\hat{\Omega} \mid \Omega) = \Omega$.
Active management induces a functional dependence $\hat{w}(\hat{\Omega})$, a correlation between the portfolio and
the estimation error in $\hat{\Omega}$, so that



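$$E_{\hat{\Omega}}\left(\hat{w}'\hat{\Omega}\hat{w} \mid \Omega\right) \neq E_{\hat{\Omega}}\left(\hat{w}'\Omega\hat{w} \mid \Omega\right), \tag{9}$$

or

$$E_{\hat{\Omega}}\left(\hat{\Sigma}^2_{\mathrm{naive}} \mid \Omega\right) \neq \Sigma^2_{\mathrm{SO}}. \tag{10}$$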

The naive estimate of portfolio risk is typically lower than the true risk, and lower even than
the optimal risk attainable with perfect knowledge of $\Omega$. Intuitively, the optimized portfolio tends
to overweight assets with underforecast risk, and to underweight assets whose risk is overestimated.

The degree of this bias grows with the uncertainty in $\hat{\Omega}$ and the sensitivity of the portfolio to
$\hat{\Omega}$, via $\hat{w}(\hat{\Omega})$. For a portfolio constructed independent of $\hat{\Omega}$, such as a passive index fund, the left
and right of (10) are equal.

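We compare to the risk of the true, unknown optimal portfolio, $w^*$. Any portfolio on the
efficient frontier is the minimum risk portfolio under a fixed return constraint. For a minimum risk
portfolio $\hat{w}$ subject to continuous constraints, we may formally expand the risk about $w^*(\Omega)$ as
$\hat{w} = w^* + \Delta w$:

$$\begin{aligned}
E(\hat{w}'\Omega\hat{w}) &= E\left((w^* + \Delta w)'\Omega(w^* + \Delta w)\right) \\
&= w^{*\prime}\Omega w^* + 2 w^{*\prime}\Omega E(\Delta w) + E(\Delta w'\Omega\Delta w) \\
&= w^{*\prime}\Omega w^* + E(\Delta w'\Omega\Delta w).
\end{aligned} \tag{11}$$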
The cross term $w^{*\prime}\Omega\Delta w$ vanishes not just in expectation but for any $\hat{w}$ satisfying the constraints,
by the optimality condition on $w^*$. The final expression gives an intuitive decomposition of
risk as that attainable with perfect knowledge of $\Omega$ plus the cost of uncertainty.

Note that $E(\Delta w'\Omega\Delta w)$ is positive, so the effect of the uncertainty is a risk penalty. Considering
the portfolio $\hat{w}$ to be an estimator of the true optimal portfolio $w^*$, Equation (11) quantifies the
risk cost of estimation error.^4
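
The vanishing cross term can be verified directly in the simplest constrained case. The sketch below assumes a fully invested minimum-risk portfolio (a single linear constraint, chosen only for illustration) and an arbitrary covariance matrix; any other fully invested portfolio then has risk equal to the optimal risk plus a positive penalty.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.normal(size=(n, n))
omega = A @ A.T / n + 0.1 * np.eye(n)   # hypothetical true covariance matrix
ones = np.ones(n)

# True minimum-risk, fully invested portfolio: w* proportional to omega^{-1} 1.
w_star = np.linalg.solve(omega, ones)
w_star /= ones @ w_star

# A perturbation that keeps the portfolio fully invested (weights still sum to 1).
dw = rng.normal(size=n)
dw -= dw.mean()
w_hat = w_star + 0.1 * dw
delta = w_hat - w_star

cross_term = w_star @ omega @ delta      # vanishes by the optimality of w*
risk_hat = w_hat @ omega @ w_hat
risk_star = w_star @ omega @ w_star
penalty = delta @ omega @ delta          # strictly positive risk penalty
print(cross_term, risk_hat, risk_star + penalty)
```

The identity holds for each perturbation individually, which is why the penalty survives averaging over estimation errors.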

Although we focus on the uncertainty due to estimation errors in $\hat{\Omega}$, the expected value in
Equation (7) may also be extended to other sources of uncertainty, such as stochastic time-dependence
in $\Omega$ and $\alpha$. To quantify this behavior requires additional modeling assumptions, resulting in
greater subjectivity, but the result is qualitatively the same: optimization produces an asymmetric
downside exposure to model uncertainty.



3 Asset Covariance Matrix 

We first explore second order risk in the context of the covariance matrix estimated directly from
asset returns:

$$\hat{\Omega} = \frac{1}{T} r r'. \tag{12}$$

Here $r$ is the $N \times T$ matrix of de-meaned^5 returns of $N$ assets over $T$ observation periods. Assuming
Gaussian returns, $\hat{\Omega}$ follows a Wishart distribution [6].

For the simple portfolio of Equation (5), the risk of (7) can be calculated explicitly. In terms
of the observable $\hat{w}'\hat{\Omega}\hat{w}$, we find

$$\Sigma^2_{\mathrm{SO}} = E(\hat{w}'\Omega\hat{w}) \approx E(\hat{w}'\hat{\Omega}\hat{w}) \left(1 - \frac{N}{T}\right)^{-2}. \tag{13}$$

Details of the calculation are given in the Appendix.
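
The size of this bias can be checked with a short simulation. The sketch below is not the Appendix calculation; it assumes Gaussian returns with known zero mean, an arbitrary "true" covariance matrix, a variance inflation factor of $(1 - N/T)^{-2}$, and hypothetical names such as `simulate_second_order_risk`. It compares the average naive variance with the average realized variance of the portfolio of Equation (5).

```python
import numpy as np

def simulate_second_order_risk(n_assets=20, n_obs=100, n_trials=2000, seed=3):
    """Average naive and realized variance of w_hat = omega_hat^{-1} alpha."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n_assets, n_assets))
    omega = A @ A.T / n_assets + np.eye(n_assets)   # hypothetical true covariance
    chol = np.linalg.cholesky(omega)
    alpha = rng.normal(size=n_assets)               # alpha taken as known
    naive = np.empty(n_trials)
    realized = np.empty(n_trials)
    for i in range(n_trials):
        r = chol @ rng.normal(size=(n_assets, n_obs))   # N x T zero-mean returns
        omega_hat = r @ r.T / n_obs                     # sample covariance matrix
        w_hat = np.linalg.solve(omega_hat, alpha)       # optimized portfolio
        naive[i] = w_hat @ omega_hat @ w_hat            # naive variance forecast
        realized[i] = w_hat @ omega @ w_hat             # variance under the truth
    inflation = (1.0 - n_assets / n_obs) ** -2          # assumed correction factor
    return naive.mean(), realized.mean(), inflation

naive, realized, inflation = simulate_second_order_risk()
print(f"realized/naive = {realized / naive:.2f}, inflation factor = {inflation:.2f}")
```

At these modest sizes the simple factor is only approximate, with small finite-sample corrections of relative order $1/T$, but it accounts for nearly all of the gap between the naive and realized variance.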

^4 Bounded below by the Cramér-Rao bound of statistics.

^5 For our purposes, neglecting the $\sim 1/T$ estimation error of ex-post mean returns is a harmless simplifying
assumption, unrelated to the difficult question of quantifying uncertainty in the forecast $\alpha$.

The significance of Equation (13) is twofold: it demonstrates the scale of the bias, and immediately
suggests how to correct it. Equation (13) implies that

$$\hat{\Sigma}^2_{\mathrm{SO}} \equiv \hat{w}'\hat{\Omega}\hat{w} \left(1 - \frac{N}{T}\right)^{-2} \tag{14}$$

is an unbiased estimator of the risk of the optimized portfolio.

Crucially, the correction for second order risk is a function of $N$ and $T$ only, which are known to
the investor without additional information about $\Omega$, making it possible to forecast second order
risk. For an investment universe of 500 assets, and an asset covariance matrix estimated from 4
years of daily returns, Equation (14) doubles the predicted standard deviation of portfolio returns.
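
The doubling can be reproduced with a one-line calculation. The helper below is hypothetical; it assumes that the variance correction is $(1 - N/T)^{-2}$, so that volatility is multiplied by $(1 - N/T)^{-1}$, and that four years of daily data amount to roughly 1000 observations (about 250 trading days per year).

```python
def second_order_volatility_multiplier(n_assets: int, n_obs: int) -> float:
    """Factor by which the naive volatility forecast is scaled up (assumed form)."""
    if n_obs <= n_assets:
        raise ValueError("the correction diverges unless n_obs > n_assets")
    return 1.0 / (1.0 - n_assets / n_obs)

# 500 assets, roughly 4 years x 250 trading days of daily returns.
multiplier = second_order_volatility_multiplier(500, 1000)
print(multiplier)
```

Because the factor depends only on the ratio $N/T$, shortening the window to track changing markets raises the required correction quickly.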

3.1 Empirical Results 

The simplicity of the Sharpe ratio optimized portfolio (5) aided in deriving the simple second order
risk correction in Equation (14), but the inflation factor $(1 - N/T)^{-2}$ can be a sufficient approximation
to the correction needed for other utility functions.

Figure 3 shows the results of a Monte Carlo simulation for the minimum risk portfolio, constrained
to be fully invested and to have fixed expected return $\hat{w}'\alpha = R$ with respect to a randomly
chosen $\alpha$ and fixed covariance matrix $\Omega$. For each value of $R$, a new $\hat{\Omega}$ is estimated from $T = 100$
observations of $N = 50$ returns.

The curve labeled "True Frontier" is the efficient frontier that could be achieved if $\Omega$ were known
with certainty, corresponding to the True Best point in Figure 2. Risk along the true frontier is
given by $\sqrt{w^{*\prime}\Omega w^*}$.

The points labeled "Realized" show the actual risk $\sqrt{\hat{w}'\Omega\hat{w}}$ of the optimized portfolios, which
correspond to True in Figure 2. This risk is well above the optimal risk, showing that estimation
error degrades performance by preventing the optimal hedging of risk.

The "Naive Forecast" risk, $\sqrt{\hat{w}'\hat{\Omega}\hat{w}}$, is seen to be significantly over-optimistic, on average by a
factor of two. Its location to the right of the true frontier is in correspondence with the position of
the Naive Best point in Figure 2, overestimating not only the utility attainable with a model, but
also what would be attainable with perfect information.

In contrast, the "Corrected Forecast" $\sqrt{\hat{w}'\hat{\Omega}\hat{w}}\,(1 - N/T)^{-1}$ accurately captures the risk of the
optimized portfolio. Although its efficiency is diminished by noise, the corrected forecast provides
unbiased estimates.

Testing the methodology with real market data is complicated by the fact that the "true"
covariance matrix is not known, so $\hat{w}'\Omega\hat{w}$ must be estimated by observing realized volatility. In
the context of portfolios, forecasts, and market conditions that are changing in time, the Bias
Statistic is a useful tool for testing the accuracy of risk forecasts.
Figure 3: Monte Carlo simulation. The average Corrected Forecast is the same as the average
Realized risk, below the true efficient frontier accessible only with perfect information about $\Omega$.
The Naive Forecast appears better than even the true efficient frontier.
Figure 4: Average bias statistics of real portfolios with varying number of assets $N$ and length
of observation window $T$. A bias statistic greater (less) than 1 indicates under- (over-) forecast
risk. "Naive" denotes the conventional risk forecast for the optimized portfolio, Equation (5), with
random alpha vector among $N$ assets. The "Corrected" forecasts are for the same portfolios as
Naive, but with forecasts corrected for second order risk. The bias statistics of random portfolios
among the same assets are shown for comparison.
For a given time series of portfolios $w_t$, the bias statistic is constructed from the forecast
standard deviations $\Sigma_t$ and out-of-sample realized returns $R_{t+1}$ as



$$B = \mathrm{std}\left(\frac{R_{t+1}}{\Sigma_t}\right). \tag{15}$$

An underforecast $\Sigma_t$ produces bias statistics greater than 1, while overforecasts lead to $B < 1$.
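
A minimal sketch of the bias statistic follows, assuming the common definition as the sample standard deviation of realized returns standardized by their forecast volatility; the function name `bias_statistic` and the simulated inputs are illustrative only.

```python
import numpy as np

def bias_statistic(realized_returns, forecast_vols):
    """Standard deviation of returns standardized by forecast volatility (assumed definition)."""
    z = np.asarray(realized_returns) / np.asarray(forecast_vols)
    return z.std(ddof=1)

rng = np.random.default_rng(4)
true_vol = 0.02
returns = rng.normal(0.0, true_vol, size=5000)   # hypothetical out-of-sample returns

accurate = bias_statistic(returns, np.full(5000, true_vol))           # forecast equals truth
underforecast = bias_statistic(returns, np.full(5000, true_vol / 2))  # forecast is half the truth
print(accurate, underforecast)
```

An accurate forecast gives a statistic near 1, while a forecast of half the true volatility gives a statistic near 2.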

We construct portfolios from a universe of the largest stocks in the United States, studying 
daily returns for the ten years through April 2009. To reduce the effects of extreme events, asset 
returns are trimmed to -50% and +80%. Assets with an incomplete history are excluded from the 
sample, leaving about 1800 stocks. 

For each trial, we construct portfolios among $N$ stocks selected at random. Optimized portfolios
are constructed according to Equation (5), with $\hat{\Omega}$ estimated from a rolling window of $T$ days, and
a random $\alpha$ vector. Each day, risk forecasts are constructed with the naive estimator (8) and
corrected estimator (14), from which bias statistics are calculated. As a control, we also construct
random portfolios from the same stocks, which are not subject to the bias of second order risk.

Figure 4 shows the average bias statistics over 50 trials, for portfolios of $N$ = 10, 25, 50 and 100
assets, and observation windows $T$ with $T/N$ = 1.5, 1.75, 2, 2.5, 3 and 4. In all cases, the standard
risk forecast of optimized portfolios significantly underforecasts realized volatility, as indicated by
bias statistics greater than 1. Comparison with the control demonstrates that the underforecast
risk is a result of optimization, not some other feature of the distribution.

Second order risk is therefore responsible for a significant portion of the out-of-sample portfolio
volatility. In contrast, the corrected forecasts are nearly as accurate as the control, confirming the
validity of the estimator (14).

4 Factor Process 

Section 3 assumed the returns generating process to be a multivariate Gaussian at the level of the
assets, with no additional structure. Because the resulting estimation error effects scale like $N/T$, a
large investment universe may require decades of data for robust optimization, far longer than the
timescales over which market relationships are stable.

Perhaps due to experience with such effects, most practitioners do not build large optimized 
portfolios from covariance matrices estimated directly from the assets, instead using more robust 
factor models. Rather than estimate the correlation among every pair of assets, a smaller number 
of systematic factors is identified, such as Value, Momentum, or industry membership. The return
of each asset is decomposed into the $K \times T$ factor return $f$ and the asset-specific, or idiosyncratic,
return $\epsilon$:

$$r = Xf + \epsilon.$$

The $N \times K$ matrix $X$ defines the model exposure of each asset to the $K$ factors. It is assumed that
all correlation among assets is due to their common exposure to the factors, so that the covariance
matrix takes the form

$$\hat{\Omega} = X\hat{F}X' + \Delta, \tag{16}$$

where the estimated factor covariance matrix is defined to be $\hat{F} = f f'/T$,^6 and the specific risk
matrix $\Delta$ is assumed to be diagonal. For a broadly diversified portfolio of $N$ assets, the specific
risk contribution is suppressed by a factor of $1/N$ relative to the factor risk, assuming the latter
has not been hedged away. Because it is diagonal, estimation error in the specific risk matrix does
not have the effects of covariance matrices, as an optimizer is not fooled into hedging out spurious
correlations. For simplicity, we use the true $\Delta$ rather than an estimate, which omits corrections
suppressed to order $1/N$ compared to the effects under study.

Rather than estimate all $N(N+1)/2$ elements of the asset covariance matrix, the smaller
$K \times K$ factor covariance matrix can be estimated from a much shorter time history, allowing a
better reflection of the current market conditions. Provided the factor model accurately captures 
the returns process, it yields a far more robust estimate of the covariance among assets. 
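
The parameter saving can be made concrete with a small sketch of the factor structure of Equation (16). The exposures, factor returns and specific risks below are random stand-ins invented for illustration, and the window is deliberately shorter than the number of assets.

```python
import numpy as np

rng = np.random.default_rng(5)
n_assets, n_factors, n_obs = 200, 5, 60

X = rng.normal(size=(n_assets, n_factors))        # hypothetical model exposures (N x K)
f = 0.01 * rng.normal(size=(n_factors, n_obs))    # hypothetical factor returns (K x T)
F_hat = f @ f.T / n_obs                           # estimated factor covariance (K x K)
delta = np.diag(rng.uniform(0.01, 0.04, n_assets) ** 2)   # diagonal specific variances

omega_hat = X @ F_hat @ X.T + delta               # factor-model asset covariance

asset_params = n_assets * (n_assets + 1) // 2     # free elements of a full N x N matrix
factor_params = n_factors * (n_factors + 1) // 2 + n_assets   # F_hat plus diagonal delta
print(asset_params, factor_params)
```

With only 60 observations a directly estimated 200-asset covariance matrix would be singular, whereas the factor-model matrix here is positive definite and requires far fewer estimated quantities.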

4.1 Factor Modeling Errors 

Unfortunately, factor models may be subject to a variety of errors, which can degrade their
performance. Similar to the asset-level case above, the factor covariance matrix has estimation errors
due to the finite number of observations from which it is estimated. These errors are of order $K/T$,
typically far better than the $N/T$ behavior of errors of the asset covariance matrix, but significant.

Unlike the asset-level case, factor models have an additional source of error in the factor
exposures $X$, which is more difficult to quantify. In general, determining the exposures is an inexact
science, requiring financial insight as much as straightforward econometric technique. Errors in the
exposures, though difficult to measure, can skew the risk forecasts of portfolios, particularly those
constructed to minimize exposure to systematic risk.

^6 We again neglect ex-post mean returns, for simplicity of exposition.
Two classes of exposure errors may be identified, which we call "coherent" and "incoherent."
Incoherent errors are those that are uncorrelated with the portfolio, so that their aggregate effect is
diversified. For example, if the true exposures $\tilde{X}$ differ from the model exposures by uncorrelated
random noise, $\tilde{X} = X + e$, then the portfolio exposure error is approximately

$$\sum_i w_i e_i \sim \frac{1}{N}\sqrt{N}\,\mathrm{std}(e) \sim \frac{\mathrm{std}(e)}{\sqrt{N}},$$

giving a contribution to variance $\sim 1/N$, behaving like diversifiable specific risk rather than
systematic factor risk. The $\sqrt{N}$ suppression is due to the low likelihood that many terms contribute
with the same sign.

A more dangerous class of errors, the coherent exposure errors, are those for which the portfolio
is likely to accumulate a finite exposure. These errors can be generated by discrepancies between
the alpha signal and the risk model, such as small differences in factor definitions. Similar to other
errors we have seen, the optimized portfolio tends to align itself with these errors, resulting in an
unsuppressed, non-diversifiable contribution

$$\sum_i w_i e_i \sim \frac{1}{N} N\,\mathrm{std}(e) \sim \mathrm{std}(e).$$

For example, if a risk model defines a Value factor in terms of the Book-to-Price ratio, while the
alpha Value factor uses Earnings-to-Price, the optimized portfolio may make large unintended bets
[7] on the difference between these Value factor definitions. The contribution from this hidden
exposure may be significant for a portfolio that has hedged away known factor exposures.
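
The two scaling behaviors can be illustrated with a toy calculation. In the sketch below, the exposure errors for a single factor are drawn as Gaussian noise; an equal-weighted portfolio stands in for the incoherent case, and a portfolio whose weights carry the sign of each error stands in for an optimizer that has aligned with the errors. Both portfolios are illustrative assumptions, not constructions from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
n_assets = 10_000
noise_std = 0.1
e = rng.normal(0.0, noise_std, size=n_assets)     # hypothetical exposure errors

w_incoherent = np.full(n_assets, 1.0 / n_assets)  # weights unrelated to the errors
w_coherent = np.sign(e) / n_assets                # weights aligned with the errors

incoherent_exposure = w_incoherent @ e            # of order std(e) / sqrt(N)
coherent_exposure = w_coherent @ e                # of order std(e), no suppression
print(incoherent_exposure, coherent_exposure)
```

Both portfolios have gross weight 1, yet the aligned portfolio accumulates an exposure comparable to the noise level itself, while the unaligned exposure shrinks as more assets are added.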

To see how these coherent exposure errors occur, consider the estimated maximum Sharpe ratio
portfolio, Equation (5). Without loss of generality, we work in a basis of assets such that the
specific risk is uniform, $\Delta = \sigma^2 I_N$.^7 Furthermore, we choose a basis of model factors such that
$X'X = N I_K$. The factor of $N$ captures the scaling behavior of $X'X$ with the number of assets.

For the factor model of Equation (16), it is useful to decompose $\alpha$ into components in the plane
of the model exposures, and an orthogonal piece,

$$\alpha = Xa + \alpha_\perp, \tag{17}$$

where $\alpha_\perp = (1 - XX'/N)\alpha$ satisfies $X'\alpha_\perp = 0$, and $a = X'\alpha/N$. After some algebra, Equations (5), (16)
and (17) yield

$$\hat{w} = \omega\left(\alpha_\perp + \sigma^2 N^{-1} X\hat{F}^{-1}a\right) + \ldots, \tag{18}$$

where $\omega$ is an arbitrary constant. The $\ldots$ represent terms $O\left((\sigma^2 N^{-1}\hat{F}^{-1})^2\right)$, which are suppressed by
a further factor of $1/N$ in a large universe of assets.

^7 This can be achieved by taking $\epsilon_{it} \to \sigma\epsilon_{it}/\sigma_i$, $r_{it} \to \sigma r_{it}/\sigma_i$, $X_{ia} \to \sigma X_{ia}/\sigma_i$, which is the usual map from optimal
regression weights proportional to $1/\sigma_i^2$ to an equivalent OLS regression.

Equation (18) shows the potential for distortions from a misalignment of the alpha factors and
risk factors, with $O(N^{-1})$ suppression for the factor component of $\alpha$ relative to the component $\alpha_\perp$
orthogonal to the plane of known factor risk. The model factor risk

$$\hat{w}'X\hat{F}X'\hat{w} = a'\hat{F}^{-1}a \tag{19}$$

sees no systematic risk in $\alpha_\perp$.

Unless the manager has extraordinary skill, $\alpha_\perp$ may represent noise rather than an arbitrage
opportunity^8 (in the sense of the Arbitrage Pricing Theory), and optimization points the portfolio
into a blind spot of the risk model.

These coherent exposure errors are avoided if the factor exposures contain the full alpha signal,

$$\alpha = Xa. \tag{20}$$

If the alpha signal contains a component orthogonal to the plane spanned by the factor exposures,
this component can be used to estimate an additional factor [8]:

$$f^{(\alpha)} = \frac{\alpha_\perp' r}{\alpha_\perp'\alpha_\perp}. \tag{21}$$

Although the resulting exposures may still contain errors, they are of the safe, incoherent variety,
as the portfolio is unlikely to align along them.

If it is not feasible to estimate an additional factor from the alpha signal, which would place $\alpha$ in the
plane of model factor exposures, we do not know whether the orthogonal component $\alpha_\perp$ represents
an arbitrage opportunity, or just a limitation of the risk model.

To see this, assume that there are true risk factors $\tilde{X}$, which differ from the model factors $X$ by

$$\tilde{X} = X + e. \tag{22}$$

^8 Even managers using a stock-picking strategy, rather than making explicit factor bets, may be judging assets on
a handful of criteria that constitute risk factors.
We can assume without loss of generality that the noise is orthogonal to the model exposures,^9

$$X'e = 0. \tag{23}$$

A bit of algebra shows that the estimated factor returns $\hat{f}$ associated with $X$ are related to the true
factor returns $f$ as

$$\hat{f} = f + X'\epsilon/N, \tag{24}$$

with $\epsilon$ the idiosyncratic return. Therefore, at large $N$ and $T$, the estimated factor covariance matrix is
accurate, $\hat{F} \approx F$, and the true factor risk of (18) is related to the model, Equation (19), by

$$\hat{w}'\tilde{X}F\tilde{X}'\hat{w} = \hat{w}'X\hat{F}X'\hat{w} + \omega^2\left(\alpha_\perp' e F e'\alpha_\perp + 2\sigma^2 a'e'\alpha_\perp\right). \tag{25}$$

The terms on the right represent a contribution to the factor risk that the model does not see, a
form of second order risk. The noise $e$ is unknown, but the orientation of the portfolio to the plane
of model exposures provides insight into its magnitude.

Assuming "no arbitrage", that $\alpha$ lies fully in the plane of the true factors $\tilde{X}$,

$$\alpha = \tilde{X}a, \tag{26}$$

then $\alpha_\perp = ea$, and the remaining $e$ dependence is of the form $e'e$, which in a large universe is
sensitive only to the statistics of $e$, rather than its details. Imposing the further assumption that $e$
arises from white noise,

$$e'e \approx N\rho^2 I_K, \tag{27}$$

where $\rho$ is an unknown noise parameter, we may solve for the total factor risk of (25):

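$$\hat{w}'\tilde{X}F\tilde{X}'\hat{w} = \hat{w}'X\hat{F}X'\hat{w} + \omega^2\left(\frac{(\alpha_\perp'\alpha_\perp)^2}{(a'a)^2}\, a'\hat{F}a + 2\sigma^2\alpha_\perp'\alpha_\perp\right). \tag{28}$$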



Like Equation (14), the right-hand side forecasts second order risk using only information available
to the investor. Though relying on the assumptions of Equations (26) and (27), these corrections
warn of the susceptibility to second order risk for a portfolio tilted far out of the plane of the model
exposures.

^9 Since factor processes are invariant under an arbitrary change of basis in the space of factors, $X \to XM$, if
$X'e \neq 0$, we can redefine $X \to X(X'X)^{-1}X'\tilde{X}$, so that $X'e = X'(\tilde{X} - X) = 0$.
4.2 Factor Model Estimation Error

If the model exposures contain $\alpha$, as in Equation (20), then the effects of modeling errors discussed
in Section 4.1 are replaced by the milder effects of estimation error.

We continue to assume as in Equations (22) and (23) that there is unknown noise in the
exposures. Equation (24) implies $\hat{f} = f$, at large $N$. Unlike in Section 4.1 we do not assume the
large $T$ limit, so there is estimation error in the model factor covariance matrix $\hat{F} = \hat{f}\hat{f}'/T$. With the
assumption $\alpha_\perp = 0$, the true factor risk of the portfolio (18) is

$$\begin{aligned}
\hat{w}'\tilde{X}F\tilde{X}'\hat{w} &= N^{-2} a'\hat{F}^{-1}X'\tilde{X}F\tilde{X}'X\hat{F}^{-1}a \\
&= a'\hat{F}^{-1}F\hat{F}^{-1}a,
\end{aligned} \tag{29}$$

while the naive estimate is

$$\hat{w}'X\hat{F}X'\hat{w} = a'\hat{F}^{-1}a. \tag{30}$$

In close analogy with the asset-level case, taking expected values of Equations (29) and (30) yields

$$\Sigma^2_{\mathrm{SO},F} = E_{\hat{F}}\left(\hat{w}'\tilde{X}F\tilde{X}'\hat{w} \mid F\right) = \left(1 - \frac{K}{T}\right)^{-2} E_{\hat{F}}\left(\hat{w}'X\hat{F}X'\hat{w} \mid F\right). \tag{31}$$

Therefore, the estimator

$$\hat{\Sigma}^2_{\mathrm{SO},F} \equiv \left(1 - \frac{K}{T}\right)^{-2} \hat{w}'X\hat{F}X'\hat{w} \tag{32}$$

provides unbiased forecasts, without additional knowledge of the true factors. More complicated
corrections hold beyond the $N \to \infty$ limit. Despite the true factor risk depending on both the
unknown exposures $\tilde{X}$ and factor covariance matrix $F$, Equation (32) forecasts second order risk
using only observable quantities.

While the $N/T$ effects of Section 3 could overwhelm a large portfolio, these $K/T$ effects are more likely to
be under control. However, for a typical factor model of $K \sim 50$ factors and effective observation
window^10 $T \sim 200$, the $\sim 30\%$ boost to forecast volatility is an important correction.
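
As a worked instance of the factor-level correction, take the values quoted above, $K = 50$ and $T = 200$, and assume that the volatility multiplier is the square root of a variance inflation of $(1 - K/T)^{-2}$, which is what the quoted boost of roughly 30% suggests:

```latex
\left(1 - \frac{K}{T}\right)^{-1} = \left(1 - \frac{50}{200}\right)^{-1} = \frac{4}{3} \approx 1.33
```

The same arithmetic applied to the asset-level case with $N/T = 0.5$ gives a multiplier of 2, so the factor structure reduces the correction substantially without removing it.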



4.3 Empirical Results 

To study the effects of factor model estimation errors, we consider the Barra Global Equity Model
(GEM2) [9]. The model estimates a World factor, 34 industry factors, 55 country factors and 8
style factors from an estimation universe based on the MSCI All Country World Index, consisting
of about 8000 stocks.

^10 See the discussion of effective observation window in Section 5.

We consider an ensemble of 500 portfolios constructed from the estimation universe, optimized
using (18) relative to $\alpha = Xa$, with random vector $a$.^11 To make contact with Equation (32)
we consider $\hat{\Omega}$ given by Equation (16), with $\hat{F}$ and $\Delta$ estimated with equally weighted covariance
estimators over a rolling window of $T = 156$ weekly returns. We discuss the corrections necessary
for the exponentially weighted, Newey-West estimator used in GEM2 in Section 5. To reduce the
effect of extreme events, asset returns outside of (-80%, 400%) or exceeding ten cross-sectional
standard deviations are dropped.

As a control, we also construct 500 portfolios of the form w = Xb, with b a random vector. 
Though exposed to factor risk, these portfolios are not subject to the biased second order risk of 
optimized portfolios. 
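
The contrast between optimized and control portfolios can be sketched with a small Monte Carlo experiment in factor space. This is an illustration under stated assumptions, not the study's procedure: the true factor covariance is taken to be the identity, factor returns are Gaussian with known zero mean, and K, T, the vectors `a` and `b`, and all function names are hypothetical. The optimized exposures are $\hat{F}^{-1}a$, while the control holds fixed exposures `b`.

```python
import random


def sample_covariance(returns):
    """Equal-weighted covariance of zero-mean factor returns (rows are observations)."""
    T, K = len(returns), len(returns[0])
    return [[sum(r[i] * r[j] for r in returns) / T for j in range(K)] for i in range(K)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def quad(x, matrix):
    """Quadratic form x' M x."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


def simulate(K=5, T=40, trials=2000, seed=0):
    """Return (true/naive) average variance ratios for optimized and control exposures."""
    rng = random.Random(seed)
    a = [1.0] * K  # hypothetical factor-level alpha
    b = [1.0] * K  # hypothetical fixed exposures of the control
    naive_opt = true_opt = naive_ctl = true_ctl = 0.0
    for _ in range(trials):
        # The true covariance F is the identity, so the true risk of exposures x is x'x.
        f = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(T)]
        F_hat = sample_covariance(f)
        x = solve(F_hat, a)                # optimized exposures F_hat^{-1} a
        naive_opt += quad(x, F_hat)        # a' F_hat^{-1} a
        true_opt += sum(v * v for v in x)  # a' F_hat^{-1} F F_hat^{-1} a with F = I
        naive_ctl += quad(b, F_hat)        # b' F_hat b
        true_ctl += sum(v * v for v in b)  # b' F b
    return true_opt / naive_opt, true_ctl / naive_ctl


opt_ratio, ctl_ratio = simulate()
print(f"optimized true/naive variance: {opt_ratio:.2f}")
print(f"control   true/naive variance: {ctl_ratio:.2f}")
```

In this toy setting the control ratio should sit near one, since the estimation error is uncorrelated with the fixed exposures, while the optimized ratio should exceed one because the exposures are chosen using the same noisy estimate that is then used to forecast their risk.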



Figure 5: Average trailing bias statistics for ensembles of 500 optimized and random portfolios (factor model, T = 156 weeks). 



The two-fold exact multicollinearity of GEM2 is resolved by projecting to a smaller subspace of factors. 

Each week, the risk is forecast using the standard risk estimator and the corrected estimator 
(31). Bias statistics are calculated with a standard deviation over a trailing 52 week window, 
over the decade ending March 2009. 
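
A trailing bias statistic of this kind can be sketched as follows. The exact definition is not restated in this section, so the code assumes the common convention: standardize each realized return by its forecast volatility, then take the standard deviation over the trailing window. The function name and the simulated data are hypothetical.

```python
import random
import statistics


def trailing_bias_statistics(returns, forecast_vols, window=52):
    """Standard deviation of standardized returns r_t / sigma_t over a trailing window."""
    standardized = [r / s for r, s in zip(returns, forecast_vols)]
    return [
        statistics.stdev(standardized[end - window:end])
        for end in range(window, len(standardized) + 1)
    ]


rng = random.Random(1)
n_weeks = 520
# Hypothetical naive weekly volatility forecast of 1%.
forecast_vols = [0.01] * n_weeks
# Simulated returns whose realized volatility is twice the forecast.
returns = [rng.gauss(0.0, 0.02) for _ in range(n_weeks)]

bias = trailing_bias_statistics(returns, forecast_vols)
average_bias = sum(bias) / len(bias)
print(f"windows: {len(bias)}, average bias statistic: {average_bias:.2f}")
```

A value near one indicates an accurate forecast. In this simulated case the forecast captures half of the realized volatility, so the average should come out close to two.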

The results in Figure 5 are easily interpreted. The control has bias statistics greater than 
one during periods of upward-trending volatility, prior to 2000 and since late 2007. Similarly, 
the control has bias statistics consistently below one between 2001 and early 2007, during which 
volatility trended downward and the trailing window overforecast volatility of future returns. The 
deviation from one is exacerbated by the use of the OLS estimator for this study, which is less 
responsive to changing market conditions. 

The naive risk forecasts of the optimized portfolios have bias statistics significantly greater than 
one for the whole of the study, demonstrating that volatility is underforecast when second order 
risk is neglected. Since the risk forecasts are so much worse than those of the control, the error 
must be due to optimization, rather than the underlying factor model. An average bias statistic of 
about two indicates that the naive forecasts only capture about half the true risk. The risk due to 
distributional uncertainty therefore contributes fully half of the risk of these portfolios. 

In comparison, the corrected risk forecasts match the accuracy of the control for the length of 
the study. By accounting for second order risk in this way, it is possible to more accurately forecast 
the volatility of optimized portfolios. 

5 Discussion 

The techniques developed here begin to account for the costs of distributional uncertainty; however, 
they are not fully general, since a number of simplifying assumptions were made to calculate the 
corrections to the risk forecast. 

Some of these assumptions may be relaxed without much difficulty. Rather than the equal-weighted 
covariance matrix estimators we have considered, it is common to use an exponentially 
weighted estimator that puts greater weight on recent events. For a half-life $\tau$ we may account for 
these effects to leading order in $K/\tau$ by replacing the number of observations $T$ with an effective 
time window 

$$ T \to 2\tau / \ln(2). \tag{33} $$

Similarly, the Newey-West estimator [10] accounting for $n > 1$ lags of serial correlation can be 
approximated with 

$$ T \to \frac{3T}{2(n + 1)}, $$

and for a fat-tailed process with uniform kurtosis $\kappa$, we may take 

$$ T \to 2T / (\kappa - 1). \tag{34} $$

Note that Equation (33) accounts for the use of the EWMA estimator, but not the non-stationarity 
for which it is adopted. To account for the latter, the general expression could be adapted to a 
model for the market process. 
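
The effective-window substitutions can be collected in a few helper functions. This is a sketch under assumptions: the function names are hypothetical, and the Newey-West form is read as 3T divided by 2(n + 1), so that the effective window shrinks as lags are added.

```python
import math


def effective_window_ewma(half_life: float) -> float:
    """Effective window for an exponentially weighted estimator: 2 * tau / ln(2)."""
    return 2.0 * half_life / math.log(2.0)


def effective_window_newey_west(n_obs: float, n_lags: int) -> float:
    """Effective window with n lags of serial correlation, read as 3T / (2 (n + 1))."""
    return 3.0 * n_obs / (2.0 * (n_lags + 1))


def effective_window_kurtosis(n_obs: float, kurtosis: float) -> float:
    """Effective window for uniform kurtosis kappa: 2T / (kappa - 1)."""
    if kurtosis <= 1.0:
        raise ValueError("kurtosis must exceed 1")
    return 2.0 * n_obs / (kurtosis - 1.0)


# Hypothetical inputs: a 52-week half-life, 156 weekly observations, one lag.
print(f"EWMA:       {effective_window_ewma(52):.1f}")
print(f"Newey-West: {effective_window_newey_west(156, 1):.1f}")
print(f"Gaussian:   {effective_window_kurtosis(156, 3.0):.1f}")
print(f"Fat-tailed: {effective_window_kurtosis(156, 6.0):.1f}")
```

For Gaussian kurtosis of 3 the kurtosis adjustment leaves the window unchanged, which is a useful sanity check on that formula. Fatter tails shorten the effective window and so enlarge the second order correction.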

Similarly, while the kurtosis correction (34) approximately accounts for a uniformly kurtotic 
underlying process, it does not treat an optimization that attempts to minimize risk to extreme 
events with a utility function that penalizes assets according to their estimated tail behavior. We 
expect the effects of this paper to be amplified in the context of fat-tails optimization, due to the 
increased estimation error associated with rare, large events. 

Another generalization is to the case of constrained optimization. We find the second order risk 
forecasts to be robust to the inclusion of a small number of constraints, and linear constraints may be 
accounted for easily. However, it is difficult to extend analytic results to more general constraints, 
and we suggest a heuristic based on the transfer coefficient [11], which measures the discrepancy 
between the constrained and unconstrained optimal portfolios. A Monte Carlo approach may also 
be useful. 

Other generalizations are more difficult. A portfolio manager might reduce a position based 
on a large forecast marginal contribution to risk, creating the correlations between the portfolio 
weights and $\hat{\Omega}$ that lead to biases in the standard risk forecasts, but such an investment strategy 
is difficult to quantify. More difficult still are the true "unknown unknowns", whose effects - by 
definition - cannot be anticipated. 

On a larger scale, second order risk provides insight into the current financial crisis, in which a 
financial system optimized under models and assumptions of the economy finds itself with far more 
risk than it had accounted for. While our focus has been the risk of an optimized portfolio, there is 
likely a general principle at work. If something has been tuned to a particular measure, that measure 
is likely to become exaggerated. Similar distortions may be commonplace, with examples ranging 
from standardized test scores to the balance sheet information upon which executive compensation 
is based. 
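A Mathematical Details 

To establish (13), we use two identities of the Wishart distribution: 

$$ E(\hat{\Omega}^{-1}) = T\,(T - N - 1)^{-1}\, \Omega^{-1} \approx \left(1 - \frac{N}{T}\right)^{-1} \Omega^{-1}, $$

$$ E(\hat{\Omega}^{-1} \Omega \hat{\Omega}^{-1}) = T^2 (T - 1) \left[(T - N)(T - N - 1)(T - N - 3)\right]^{-1} \Omega^{-1} \approx \left(1 - \frac{N}{T}\right)^{-3} \Omega^{-1}. $$

The approximations drop O(1/N) and O(1/T) terms, for simplicity. For the portfolio 

$$ w = \hat{\Omega}^{-1} \alpha, $$

comparison of the average naive forecast 

$$ \alpha' E(\hat{\Omega}^{-1})\, \alpha, $$

with the average true risk 

$$ \alpha' E(\hat{\Omega}^{-1} \Omega \hat{\Omega}^{-1})\, \alpha $$

yields Equation (13). 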
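
To see how much the dropped O(1/N) and O(1/T) terms matter, the sketch below compares the exact finite-sample ratio of average true variance to average naive variance with its leading-order form. It assumes the standard inverse-Wishart moments for T zero-mean Gaussian observations: the mean of the inverse sample covariance is T/(T - N - 1) times the true inverse, and the mean of the sandwiched product is T^2 (T - 1)/[(T - N)(T - N - 1)(T - N - 3)] times the true inverse. The function names are hypothetical.

```python
def exact_variance_ratio(n_assets: int, n_obs: int) -> float:
    """Average true variance over average naive variance, from the two Wishart moments."""
    if n_obs - n_assets - 3 <= 0:
        raise ValueError("requires T > N + 3")
    return n_obs * (n_obs - 1) / ((n_obs - n_assets) * (n_obs - n_assets - 3))


def leading_order_variance_ratio(n_assets: int, n_obs: int) -> float:
    """Leading-order approximation (1 - N/T)^-2 of the same ratio."""
    return (1.0 - n_assets / n_obs) ** -2


for n_assets, n_obs in [(50, 200), (5, 40), (100, 156)]:
    exact = exact_variance_ratio(n_assets, n_obs)
    approx = leading_order_variance_ratio(n_assets, n_obs)
    print(f"N={n_assets:4d} T={n_obs:4d} exact={exact:.3f} leading order={approx:.3f}")
```

The two expressions agree closely when both dimensions are large and N/T is moderate, and the exact ratio is the larger of the two, so the leading-order correction is, if anything, conservative in this setting.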